
WIP: [NOT READY FOR REVIEW] build wheels with CUDA 13.0.x, test wheels against mix of CTK versions#2270

Draft
jameslamb wants to merge 11 commits into rapidsai:main from jameslamb:test-older-ctk

Conversation

@jameslamb
Member

@jameslamb jameslamb commented Mar 3, 2026

Description

Contributes to rapidsai/build-planning#257

  • builds CUDA 13 wheels with the 13.0 CTK

Contributes to rapidsai/build-planning#256

  • updates wheel tests to cover a range of CTK versions (previously we were accidentally testing only the latest 12.x and 13.x)

Other changes

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@jameslamb jameslamb added non-breaking Non-breaking change improvement Improvement / enhancement to an existing function labels Mar 3, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ci/test_wheel_integrations.sh`:
- Around line 7-13: Replace the floating branch plus fixed /tmp clone path in
the git clone step: instead of cloning the mutable branch
"generate-pip-constraints" into "/tmp/gha-tools", pin the clone to an immutable
ref (a commit SHA or tag) and clone into a unique temporary directory (e.g.,
created with mktemp -d) so CI runs are reproducible; update the subsequent PATH
export that references /tmp/gha-tools to point to the new temp-dir variable and
ensure the script handles an existing directory by removing or updating it
before cloning.

In `@dependencies.yaml`:
- Around line 457-480: The CUDA 12.2 matrix pins torch==2.4.0+cu124 and uses
--extra-index-url=https://download.pytorch.org/whl/cu124 but those wheels do not
exist for Python 3.13/3.14; update the CUDA "12.2" matrix entry by either (a)
bumping the pinned package from torch==2.4.0+cu124 to a newer torch release that
provides cp313/cp314 wheels, or (b) remove the cu124-specific index and version
pin (the --extra-index-url entry and torch==2.4.0+cu124) so the matrix falls
back to a supported wheel (or match the same torch/version used in the other
cuda matrices like torch==2.9.0+cu129), ensuring the matrix with cuda: "12.2" no
longer references torch==2.4.0+cu124 or the cu124 index.
- Around line 279-282: The YAML matrix entry that sets use_cuda_wheels: "false"
currently has a bare `packages:` (null); change that `packages:` to an explicit
empty list `packages: []` for the matrix block where `use_cuda_wheels: "false"`
so the `packages` key is a list (not null) and avoids type validation issues.
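The null-vs-empty-list distinction the bot is pointing at can be sketched in a minimal fragment (structure abbreviated; this is not the repo's actual matrix block):

```yaml
# A bare `packages:` key loads as null in YAML; an explicit `[]` keeps it a list.
specific:
  - output_types: [requirements]
    matrices:
      - matrix:
          use_cuda_wheels: "false"
        packages: []   # not `packages:` (which parses as null)
```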

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5ee91d2 and aa6b5f4.

📒 Files selected for processing (2)
  • ci/test_wheel_integrations.sh
  • dependencies.yaml

@jameslamb
Member Author

/ok to test

-d "${TORCH_WHEEL_DIR}" \
--constraint "${PIP_CONSTRAINT}" \
--constraint ./torch-constraints.txt \
'torch'
Member Author

Just picking a place on the diff to have a threaded conversation.

Here's an interesting one... on the Python 3.14 + CUDA 13.1.1 + latest-dependencies jobs (arm64 and amd64), the solve falls back to numba-cuda==0.24.0, which has an sdist but no Python 3.14 wheels, so it gets built from source and that build fails!

    Downloading http://pip-cache.local.gha-runners.nvidia.com/packages/04/51/8935ff9ae5150e1ffed945bf1b95002a6a5e1f9256aeb1143e1c159b68c5/numba_cuda-0.24.0.tar.gz (1.3 MB)
...
    Installing build dependencies: started
    Running command installing build dependencies for numba-cuda
...
  Building wheels for collected packages: numba-cuda
...
    g++ -fno-strict-overflow -Wsign-compare -DNDEBUG -g -O3 -Wall -fPIC -I/tmp/pip-build-env-y5e9zevx/overlay/lib/python3.14/site-packages/numpy/_core/include -Inumba_cuda/numba/cuda/cext -I/pyenv/versions/3.14.3/include/python3.14 -c numba_cuda/numba/cuda/cext/_dispatcher.cpp -o build/temp.linux-x86_64-cpython-314/numba_cuda/numba/cuda/cext/_dispatcher.o -std=c++11
    numba_cuda/numba/cuda/cext/_dispatcher.cpp:1018:2: error: #error "Python minor version is not supported."
     1018 | #error "Python minor version is not supported."
...
    Building wheel for numba-cuda (pyproject.toml): finished with status 'error'
    ERROR: Failed building wheel for numba-cuda
  Failed to build numba-cuda
  error: failed-wheel-build-for-install

(build link)

numba-cuda 0.26.0 was the first version with Python 3.14 wheels... something must be holding the solver back from using that.

Member Author

@jameslamb jameslamb Mar 4, 2026


ahhhh there it is. torch is ==-pinning cuda-bindings (and, through it, cuda-pathfinder)

...
Collecting cuda-bindings==13.0.3 (from torch==2.10.0+cu130)
Collecting cuda-pathfinder~=1.1 (from cuda-bindings==13.0.3->torch==2.10.0+cu130)
...

Later numba-cuda[cu13] needs at least cuda-pathfinder>=1.3.1.

$ docker run --rm -it python:3.14 bash
$ pip install pkginfo
$ pip download --no-deps 'numba-cuda==0.26.0'
$ pkginfo --json ./numba_cuda*.whl
...
  "requires_dist": [
    "numba>=0.60.0",
    "cuda-bindings<14.0.0,>=12.9.1",
    "cuda-core<1.0.0,>=0.5.1",
    "packaging",
    "cuda-bindings<13.0.0,>=12.9.1; extra == \"cu12\"",
    "cuda-pathfinder<2.0.0,>=1.3.1; extra == \"cu12\"",
    "cuda-toolkit[cccl,cudart,nvcc,nvjitlink,nvrtc]==12.*; extra == \"cu12\"",
    "cuda-bindings==13.*; extra == \"cu13\"",
    "cuda-pathfinder<2.0.0,>=1.3.1; extra == \"cu13\"",
    "cuda-toolkit[cccl,cudart,nvjitlink,nvrtc,nvvm]==13.*; extra == \"cu13\""
  ],
...

numba-cuda[cu13]==0.24.0 doesn't constrain cuda-pathfinder

$ pip download --no-deps 'numba-cuda==0.24.0'
$ pkginfo --json ./numba_cuda-0.24.0*.tar.gz
...
  "requires_dist": [
    "numba>=0.60.0",
    "cuda-bindings<14.0.0,>=12.9.1",
    "cuda-core<1.0.0,>=0.3.2",
    "packaging",
    "cuda-bindings<13.0.0,>=12.9.1; extra == \"cu12\"",
    "cuda-core<1.0.0,>=0.3.0; extra == \"cu12\"",
    "cuda-toolkit[cccl,cudart,nvcc,nvjitlink,nvrtc]==12.*; extra == \"cu12\"",
    "cuda-bindings==13.*; extra == \"cu13\"",
    "cuda-core<1.0.0,>=0.3.2; extra == \"cu13\"",
    "cuda-toolkit[cccl,cudart,nvjitlink,nvrtc,nvvm]==13.*; extra == \"cu13\""
  ],
...

Looks like that was added here: NVIDIA/numba-cuda#308

It looks like this wasn't caught on earlier PRs because CI fell back to a CPU-only torch 😬

  Collecting torch>=2.10.0 (from -r test-pytorch-requirements.txt (line 4))
    Obtaining dependency information for torch>=2.10.0 from http://pip-cache.local.gha-runners.nvidia.com/packages/69/2b/51e663ff190c9d16d4a8271203b71bc73a16aa7619b9f271a69b9d4a936b/torch-2.10.0-cp314-cp314-manylinux_2_28_aarch64.whl.metadata
    Downloading http://pip-cache.local.gha-runners.nvidia.com/packages/69/2b/51e663ff190c9d16d4a8271203b71bc73a16aa7619b9f271a69b9d4a936b/torch-2.10.0-cp314-cp314-manylinux_2_28_aarch64.whl.metadata (31 kB)

(rmm#3316 - wheel-tests-integration-optional / 13.1.1, 3.14, arm64, ubuntu24.04, l4, latest-driver, latest-deps)

Member Author

Alright, it goes deeper than this.

By themselves, the latest numba-cuda and the latest torch are happily installable together on Python 3.14.

pip download \
    --no-deps \
    --index-url https://download.pytorch.org/whl/cu130 \
    'torch==2.10.0+cu130'

pip install \
    --prefer-binary \
    ./torch-*.whl \
    'numba-cuda[cu13]>=0.22.1'

# Successfully installed ... cuda-bindings-13.0.3 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-toolkit-13.0.2 ... numba-0.64.0 numba-cuda-0.28.2 ... nvidia-cublas-13.1.0.3 nvidia-cuda-cccl-13.0.85 ... torch-2.10.0+cu130

I think the problem looks like this:

  1. torch-2.10+cu130 depends on a bunch of nvidia-{thing}=={version-from-CTK-13.0.2} wheels
  2. newer numba-cuda depends on cuda-toolkit[cccl,cudart,nvrtc,nvvm]==13.* (since NVIDIA/numba-cuda#604, "Set up a new VM-based CI infrastructure")
  3. in this CI job, we're constraining to cuda-toolkit==13.1.*
  4. the solver backtracks to numba-cuda==0.24.0 (which didn't have cuda-toolkit pinnings, and whose nvidia-nccl and similar dependencies are compatible with torch-2.10's)
  5. numba-cuda==0.24.0 didn't have wheels for Python 3.14, so pip tries to build it from source
  6. that build from source fails with the error above that basically means "this doesn't support Python 3.14"

What we really want here is a big loud solver error that says "torch-2.10+cu130 only works with the packages pinned in cuda-toolkit==13.0.2, not installable here".
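The chain above can be condensed into a toy model of the resolver's backtracking (purely illustrative: the candidate metadata is simplified from the logs in this thread, and this is nothing like pip's real algorithm):

```python
# torch-2.10+cu130 pins nvidia-* wheels from CTK 13.0.2 (step 1);
# this CI job constrains cuda-toolkit==13.1.* (step 3).
TORCH_NVIDIA_CTK = "13.0"
CI_CTK_CONSTRAINT = "13.1"

# (version, ships a cp314 wheel, pins cuda-toolkit==13.*)
NUMBA_CUDA_CANDIDATES = [
    ("0.28.2", True, True),
    ("0.26.0", True, True),
    ("0.24.0", False, False),  # pre-#604: no cuda-toolkit pin
]

def resolve():
    """Walk candidates newest-first, backtracking on CTK conflicts."""
    for version, has_cp314_wheel, pins_ctk in NUMBA_CUDA_CANDIDATES:
        if pins_ctk and CI_CTK_CONSTRAINT != TORCH_NVIDIA_CTK:
            # cuda-toolkit==13.* plus the ==13.1.* constraint drags in 13.1
            # nvidia-* wheels, clashing with torch's 13.0 pins (step 4)
            continue
        return version, not has_cp314_wheel
    raise RuntimeError("no candidate fits")

chosen, must_build_sdist = resolve()
print(chosen, must_build_sdist)  # 0.24.0 True -> sdist build fails on 3.14 (steps 5-6)
```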

Member Author

Alright on the latest build, here's what happened.

Looking at the most recent CI run (build link)

All of these environments look like I'd expect them to, and show we're covering a wide range of CTK versions.

NOTE: we end up using CTK 12.4 in the arm64 jobs with RAPIDS_CUDA_VERSION=12.2.2, because there weren't aarch64 cuBLAS wheels for earlier CTKs. CUDA 12.2 will have to be tested in nightlies (I'll do that next on this PR and add a follow-up comment).

Regular wheel tests

wheel-tests / 12.2.2, 3.11, arm64, ubuntu22.04, a100, latest-driver, latest-deps

details (click me)

(link)

Looks exactly like what we want... cuda-toolkit 12.4 (allowed on arm), nvJitLink 12.9, latest numba-cuda

Successfully installed ... cuda-bindings-12.9.5 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-12.9.5 cuda-toolkit-12.4.0 ... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-cu12-12.4.99 nvidia-cuda-nvcc-cu12-12.4.99 nvidia-cuda-nvrtc-cu12-12.4.99 nvidia-cuda-runtime-cu12-12.4.99 nvidia-nvjitlink-cu12-12.9.86 ... rmm-cu12-26.4.0a55

wheel-tests / 12.9.1, 3.11, amd64, ubuntu22.04, l4, latest-driver, oldest-deps

details (click me)

(link)

No cuda-toolkit, oldest numba-cuda (oldest-deps!), nvjitlink 12.9

Successfully installed ... cuda-bindings-12.9.5 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-12.9.5 ... numba-cuda-0.22.1 numpy-1.23.5 ... nvidia-nvjitlink-cu12-12.9.86 ...rmm-cu12-26.4.0a55

wheel-tests / 12.9.1, 3.14, amd64, ubuntu24.04, h100, latest-driver, latest-deps

details (click me)

(link)

Looks good, everything from CTK 12.9 and latest numba-cuda

Successfully installed ... cuda-bindings-12.9.5 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-12.9.5 cuda-toolkit-12.9.1... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-cu12-12.9.27 nvidia-cuda-nvcc-cu12-12.9.86 nvidia-cuda-nvrtc-cu12-12.9.86 nvidia-cuda-runtime-cu12-12.9.79 nvidia-nvjitlink-cu12-12.9.86 ... rmm-cu12-26.4.0a55

wheel-tests / 13.0.2, 3.12, amd64, ubuntu24.04, l4, latest-driver, latest-deps

details (click me)

(link)

Looks good, latest numba-cuda, cuda-toolkit 13.0, most CTK libraries from 13.0, nvJitLink from 13.1.

Successfully installed ... cuda-bindings-13.1.1 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-13.1.1 cuda-toolkit-13.0.2 ... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-13.0.85 nvidia-cuda-nvrtc-13.0.88 nvidia-cuda-runtime-13.0.96 ... nvidia-nvjitlink-13.1.115 ... rmm-cu13-26.4.0a55

wheel-tests / 13.0.2, 3.12, arm64, rockylinux8, l4, latest-driver, latest-deps

details (click me)

(link)

Looks good, everything from CTK 13.0 and latest numba-cuda

Successfully installed ... cuda-bindings-13.1.1 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-13.1.1 cuda-toolkit-13.0.2 ... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-13.0.85 nvidia-cuda-nvrtc-13.0.88 nvidia-cuda-runtime-13.0.96 nvidia-nvjitlink-13.1.115 nvidia-nvvm-13.0.88 ... rmm-cu13-26.4.0a55

wheel-tests / 13.1.1, 3.13, amd64, rockylinux8, rtxpro6000, latest-driver, latest-deps

details (click me)

(link)

Looks good, everything from CTK 13.1 and latest numba-cuda

Successfully installed ... cuda-bindings-13.1.1 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-13.1.1 cuda-toolkit-13.1.1... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-13.1.115 nvidia-cuda-nvrtc-13.1.115 nvidia-cuda-runtime-13.1.80 nvidia-nvjitlink-13.1.115 nvidia-nvvm-13.1.115 ... rmm-cu13-26.4.0a55

wheel-tests / 13.1.1, 3.14, amd64, ubuntu24.04, rtxpro6000, latest-driver, latest-deps

details (click me)

(link)

Looks good, everything from CTK 13.1 and latest numba-cuda

Successfully installed ... cuda-bindings-13.1.1 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-13.1.1 cuda-toolkit-13.1.1 ... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-13.1.115 nvidia-cuda-nvrtc-13.1.115 nvidia-cuda-runtime-13.1.80 nvidia-nvjitlink-13.1.115 nvidia-nvvm-13.1.115 ... rmm-cu13-26.4.0a55

wheel-tests / 13.1.1, 3.14, arm64, ubuntu24.04, l4, latest-driver, latest-deps

details (click me)

(link)

Looks good, everything from CTK 13.1 and latest numba-cuda

Successfully installed ... cuda-bindings-13.1.1 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-13.1.1 cuda-toolkit-13.1.1... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-13.1.115 nvidia-cuda-nvrtc-13.1.115 nvidia-cuda-runtime-13.1.80 nvidia-nvjitlink-13.1.115 nvidia-nvvm-13.1.115 ... rmm-cu13-26.4.0a55

PyTorch / CuPy tests

wheel-tests-integration-optional / 12.2.2, 3.11, arm64, ubuntu22.04, a100, latest-driver, latest-deps

details (click me)

(link)

As expected, PyTorch skipped because this project doesn't test PyTorch versions old enough to run against CTK 12.2

Skipping PyTorch tests (requires CUDA 12.6-12.9 or 13.0, found 12.2.2)

CuPy tests pulled in latest numba-cuda, cuda-toolkit 12.4 (intentionally allowed on arm64, it's fine), and the latest 12.x nvjitlink (12.9). This looks like what we want!

Successfully installed ... cuda-bindings-12.9.5 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-12.9.5 cuda-toolkit-12.4.0 cupy-cuda12x-14.0.1 ... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-cu12-12.4.99 nvidia-cuda-nvcc-cu12-12.4.99 nvidia-cuda-nvrtc-cu12-12.4.99 nvidia-cuda-runtime-cu12-12.4.99 nvidia-nvjitlink-cu12-12.9.86 ... rmm-cu12-26.4.0a55
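The "Skipping PyTorch tests" messages in these logs come from a CUDA-version gate in the test script; a rough sketch of that gate (the variable name is taken from the log output, the exact script logic is assumed):

```shell
# Sketch only: gate PyTorch tests on the CUDA version, as in the logs above.
RAPIDS_CUDA_VERSION="13.1.1"
CUDA_MAJOR_MINOR="${RAPIDS_CUDA_VERSION%.*}"   # 13.1.1 -> 13.1

case "${CUDA_MAJOR_MINOR}" in
  12.[6-9]|13.0)
    echo "running PyTorch tests against CUDA ${RAPIDS_CUDA_VERSION}"
    ;;
  *)
    echo "Skipping PyTorch tests (requires CUDA 12.6-12.9 or 13.0, found ${RAPIDS_CUDA_VERSION})"
    ;;
esac
```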

wheel-tests-integration-optional / 12.9.1, 3.11, amd64, ubuntu22.04, l4, latest-driver, oldest-deps

details (click me)

(link)

For PyTorch tests, no cuda-toolkit was installed in the environment; the solve fell all the way back to numba-cuda==0.22.1 (makes sense, oldest-deps!) and used nvidia-* packages from CTK 12.9.

Successfully installed ... cuda-bindings-12.9.5 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-12.9.5 ... numba-cuda-0.22.1 numpy-1.23.5 nvidia-cublas-cu12-12.9.1.4 ... nvidia-nvjitlink-cu12-12.9.86 ... rmm-cu12-26.4.0a55 ... torch-2.9.0+cu129 ...

CuPy tests downgraded CuPy to 13.6.0 (makes sense, oldest-deps!) and that brought fastrlock down with it.

Successfully installed cupy-cuda12x-13.6.0 fastrlock-0.8.3

wheel-tests-integration-optional / 12.9.1, 3.14, amd64, ubuntu24.04, h100, latest-driver, latest-deps

details (click me)

(build link)

For PyTorch tests, cuda-toolkit 12.9 gets installed. It's the correct version (12.9) and we see the expected versions of CTK libraries, like cuBLAS 12.9 and nvJitLink 12.9.

Successfully installed ... cuda-bindings-12.9.4 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-12.9.4 cuda-toolkit-12.9.1 ... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cublas-cu12-12.9.1.4 ... nvidia-nvjitlink-cu12-12.9.86 ... rmm-cu12-26.4.0a55 ... torch-2.10.0+cu129 ...

CuPy tests kept everything in that environment and just added CuPy

Successfully installed cupy-cuda12x-14.0.1

wheel-tests-integration-optional / 13.0.2, 3.12, amd64, ubuntu24.04, l4, latest-driver, latest-deps

details (click me)

(link)

For PyTorch tests, cuda-toolkit 13.0.2 gets installed, along with the latest numba-cuda (0.28.2) and CTK 13.0 packages (e.g. cuBLAS 13.1, nvJitLink 13.0.88). See https://docs.nvidia.com/cuda/archive/13.0.2/cuda-toolkit-release-notes/index.html to confirm those versions.

Successfully installed ... cuda-bindings-13.0.3 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-13.0.3 cuda-toolkit-13.0.2 ... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cublas-13.1.0.3 ... nvidia-nvjitlink-13.0.88 ... rmm-cu13-26.4.0a55 ... torch-2.10.0+cu130 ...

CuPy tests kept everything in that environment and just added CuPy

Successfully installed cupy-cuda13x-14.0.1

wheel-tests-integration-optional / 13.0.2, 3.12, arm64, rockylinux8, l4, latest-driver, latest-deps

details (click me)

(link)

PyTorch tests pulled in cuda-toolkit==13.0.2 and CTK 13.0 libraries (including nvJitLink 13.0)

Successfully installed ... cuda-bindings-13.0.3 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-13.0.3 cuda-toolkit-13.0.2 ... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cublas-13.1.0.3 ... nvidia-nvjitlink-13.0.88 ... rmm-cu13-26.4.0a55 ... torch-2.10.0+cu130 ...

CuPy tests kept everything in that environment and just added CuPy

Successfully installed cupy-cuda13x-14.0.1

wheel-tests-integration-optional / 13.1.1, 3.13, amd64, rockylinux8, rtxpro6000, latest-driver, latest-deps

details (click me)

(link)

As expected, skipped because there aren't PyTorch wheels supporting CUDA 13.1 yet.

Skipping PyTorch tests (requires CUDA 12.6-12.9 or 13.0, found 13.1.1) 

CuPy tests pulled in cuda-toolkit 13.1, latest numba-cuda (0.28.2), and corresponding nvidia-* libraries.

Successfully installed ... cuda-bindings-13.1.1 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-13.1.1 cuda-toolkit-13.1.1 cupy-cuda13x-14.0.1 ... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-13.1.115 nvidia-cuda-nvrtc-13.1.115 nvidia-cuda-runtime-13.1.80 nvidia-nvjitlink-13.1.115 ... rmm-cu13-26.4.0a55

wheel-tests-integration-optional / 13.1.1, 3.14, amd64, ubuntu24.04, rtxpro6000, latest-driver, latest-deps

details (click me)

(link)

As expected, skipped because there aren't PyTorch wheels supporting CUDA 13.1 yet.

Skipping PyTorch tests (requires CUDA 12.6-12.9 or 13.0, found 13.1.1) 

CuPy tests pulled in cuda-toolkit 13.1, latest numba-cuda (0.28.2), and corresponding nvidia-* libraries.

Successfully installed ... cuda-bindings-13.1.1 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-13.1.1 cuda-toolkit-13.1.1 cupy-cuda13x-14.0.1 ... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-13.1.115 nvidia-cuda-nvrtc-13.1.115 nvidia-cuda-runtime-13.1.80 nvidia-nvjitlink-13.1.115 ... rmm-cu13-26.4.0a55

wheel-tests-integration-optional / 13.1.1, 3.14, arm64, ubuntu24.04, l4, latest-driver, latest-deps

details (click me)

(link)

As expected, skipped because there aren't PyTorch wheels supporting CUDA 13.1 yet.

Skipping PyTorch tests (requires CUDA 12.6-12.9 or 13.0, found 13.1.1) 

CuPy tests pulled in cuda-toolkit 13.1, latest numba-cuda (0.28.2), and corresponding nvidia-* libraries.

Successfully installed ... cuda-bindings-13.1.1 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-13.1.1 cuda-toolkit-13.1.1 cupy-cuda13x-14.0.1... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-13.1.115 nvidia-cuda-nvrtc-13.1.115 nvidia-cuda-runtime-13.1.80 nvidia-nvjitlink-13.1.115 ... rmm-cu13-26.4.0a55

Member Author

(updates from testing with nightly matrix)

wheel-tests / 12.2.2, 3.11, amd64, ubuntu22.04, v100, earliest-driver, latest-deps

(link)

Looks exactly like what we want... cuda-toolkit 12.2, 12.2 versions of most CTK libraries, nvJitLink 12.9, latest numba-cuda.

Successfully installed ... cuda-bindings-12.9.5 cuda-core-0.6.0 cuda-pathfinder-1.4.0 cuda-python-12.9.5 cuda-toolkit-12.2.2 ... numba-cuda-0.28.2 numpy-2.4.2 nvidia-cuda-cccl-cu12-12.2.140 nvidia-cuda-nvcc-cu12-12.2.140 nvidia-cuda-nvrtc-cu12-12.2.140 nvidia-cuda-runtime-cu12-12.2.140 nvidia-nvjitlink-cu12-12.9.86 ... rmm-cu12-26.4.0a57


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
ci/test_wheel_integrations.sh (1)

7-13: ⚠️ Potential issue | 🟠 Major

Pin gha-tools to an immutable ref and avoid fixed /tmp clone path before merge.

This still uses a mutable branch and a shared absolute path, which makes runs less reproducible and can fail if the path already exists.

Suggested hardening
 # TODO(jameslamb): revert before merging
-git clone --branch generate-pip-constraints \
-    https://github.com/rapidsai/gha-tools.git \
-    /tmp/gha-tools
-
-export PATH="/tmp/gha-tools/tools:${PATH}"
+GHA_TOOLS_DIR="$(mktemp -d)"
+GHA_TOOLS_REF="${GHA_TOOLS_REF:-<pin-a-tag-or-commit-sha>}"
+git clone --depth 1 https://github.com/rapidsai/gha-tools.git "${GHA_TOOLS_DIR}"
+git -C "${GHA_TOOLS_DIR}" checkout "${GHA_TOOLS_REF}"
+
+export PATH="${GHA_TOOLS_DIR}/tools:${PATH}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ci/test_wheel_integrations.sh` around lines 7 - 13, Replace the mutable
branch clone and fixed /tmp path: change the git clone call that currently uses
"--branch generate-pip-constraints" and destination "/tmp/gha-tools" to clone a
specific immutable commit SHA (pin to a known commit) and clone into a unique
directory (e.g., a mktemp/mkdir under the job workspace or a generated tempdir)
instead of "/tmp/gha-tools"; then update the export PATH line that references
"/tmp/gha-tools/tools" to point at the new tempdir's tools subdirectory and
ensure the script cleans up the tempdir when finished.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ci/test_wheel_integrations.sh`:
- Line 73: Update the skip log invoked via rapids-logger so the message matches
the actual gate (CUDA must be >=12.6 and <13.1 or 13.0); replace the current
text "Skipping PyTorch tests (requires CUDA <13.1, found
${RAPIDS_CUDA_VERSION})" with a message that states the full allowed range, e.g.
"Skipping PyTorch tests (requires CUDA >=12.6 and <13.1, found
${RAPIDS_CUDA_VERSION})", keeping the rapids-logger call and the
${RAPIDS_CUDA_VERSION} interpolation intact.

---

Duplicate comments:
In `@ci/test_wheel_integrations.sh`:
- Around line 7-13: Replace the mutable branch clone and fixed /tmp path: change
the git clone call that currently uses "--branch generate-pip-constraints" and
destination "/tmp/gha-tools" to clone a specific immutable commit SHA (pin to a
known commit) and clone into a unique directory (e.g., a mktemp/mkdir under the
job workspace or a generated tempdir) instead of "/tmp/gha-tools"; then update
the export PATH line that references "/tmp/gha-tools/tools" to point at the new
tempdir's tools subdirectory and ensure the script cleans up the tempdir when
finished.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4e302f1e-2607-40a8-85db-ed34941bca20

📥 Commits

Reviewing files that changed from the base of the PR and between 66fca02 and ec3e22e.

📒 Files selected for processing (2)
  • ci/test_wheel_integrations.sh
  • dependencies.yaml

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (1)
ci/test_wheel_integrations.sh (1)

7-13: ⚠️ Potential issue | 🟠 Major

Pin gha-tools to an immutable ref and stop cloning into a fixed /tmp path.

Line 8 currently pulls a mutable branch into /tmp/gha-tools, which keeps CI non-reproducible and can fail on path reuse. Please switch to a unique temp dir and a pinned tag/SHA before merge.

Suggested hardening
-# TODO(jameslamb): revert before merging
-git clone --branch generate-pip-constraints \
-    https://github.com/rapidsai/gha-tools.git \
-    /tmp/gha-tools
-
-export PATH="/tmp/gha-tools/tools:${PATH}"
+# TODO(jameslamb): revert before merging
+GHA_TOOLS_DIR="$(mktemp -d)"
+GHA_TOOLS_REF="${GHA_TOOLS_REF:-<pin-a-tag-or-commit-sha>}"
+git clone --depth 1 https://github.com/rapidsai/gha-tools.git "${GHA_TOOLS_DIR}"
+git -C "${GHA_TOOLS_DIR}" checkout "${GHA_TOOLS_REF}"
+
+export PATH="${GHA_TOOLS_DIR}/tools:${PATH}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ci/test_wheel_integrations.sh` around lines 7 - 13, Replace the mutable
branch clone into the fixed /tmp/gha-tools path by cloning a pinned immutable
ref into a unique temporary directory: create a temp dir (e.g., via mktemp -d),
git clone using a specific tag or commit SHA instead of the branch name (replace
"generate-pip-constraints" with the chosen tag/SHA), point PATH at the temp
dir's tools subdir (export PATH="$TEMP_DIR/tools:${PATH}"), and ensure the temp
dir is cleaned up after the script finishes; update the git clone and export
PATH lines accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@ci/test_wheel_integrations.sh`:
- Around line 7-13: Replace the mutable branch clone into the fixed
/tmp/gha-tools path by cloning a pinned immutable ref into a unique temporary
directory: create a temp dir (e.g., via mktemp -d), git clone using a specific
tag or commit SHA instead of the branch name (replace "generate-pip-constraints"
with the chosen tag/SHA), point PATH at the temp dir's tools subdir (export
PATH="$TEMP_DIR/tools:${PATH}"), and ensure the temp dir is cleaned up after the
script finishes; update the git clone and export PATH lines accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 12ddf0fc-8137-45a2-9f85-631dc3000aef

📥 Commits

Reviewing files that changed from the base of the PR and between ec3e22e and b7db610.

📒 Files selected for processing (1)
  • ci/test_wheel_integrations.sh

@jameslamb
Copy link
Member Author

/ok to test

rapids-bot bot pushed a commit to rapidsai/gha-tools that referenced this pull request Mar 5, 2026
Contributes to rapidsai/build-planning#256

`rapids-generate-pip-constraints` currently special-cases `RAPIDS_DEPENDENCIES="latest"` and skips generating constraints in that case.

This will be helpful in rapidsai/build-planning#256, where we want to start constraining `cuda-toolkit` in wheels CI based on the CTK version in the CI image being used.

## Notes for Reviewers

### How I tested this

Looked for projects using this ([GitHub search](https://github.com/search?q=org%3Arapidsai+language%3AShell+%22rapids-generate-pip-constraints%22+AND+NOT+is%3Aarchived+&type=code)) and tested in them.

It's just a few:

* [ ] cudf (rapidsai/cudf#21639)
* [ ] cuml (rapidsai/cuml#7853)
* [ ] dask-cuda (rapidsai/dask-cuda#1632)
* [ ] nvforest (rapidsai/nvforest#62)
* [ ] raft (rapidsai/raft#2971)
* [ ] rmm (rapidsai/rmm#2270)

On all of those, wheels CI jobs worked exactly as expected and without needing any code changes or `dependencies.yaml` updates... so this PR is safe to merge any time.

### Is this safe?

It should be (see "How I tested this").

This is only used to add **constraints** (not requirements), so it shouldn't change our ability to catch problems like "forgot to declare a dependency" in CI.

It WILL increase the risk of `[test]` extras being underspecified. For example, if `cuml[test]` has `scikit-learn>=1.3` and the constraints have `scikit-learn>=1.5`, we might never end up testing `scikit-learn>=1.3,<1.5` (unless it's explicitly accounted for in a `dependencies: "oldest"` block).

The other risk here is that this creates friction, because constraints passed to `--constraint` cannot contain extras. So e.g. if you want to depend on `xgboost[dask]`, that cannot be in any of the lists generated by `rapids-generate-pip-constraints`. I think we can work around that when we hit those cases.
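That friction is easy to see in a quick sketch (the package name and version here are arbitrary examples; recent pip versions reject extras in constraint files with "Constraints cannot have extras"):

```shell
# Illustrative only: demonstrate that pip refuses a constraints file
# containing extras. The pinned package/version are made up for the demo.
tmpdir="$(mktemp -d)"
echo 'xgboost[dask]==2.1.0' > "${tmpdir}/constraints.txt"

# Expected to fail before any network access, at constraint-parsing time.
if pip install --dry-run --no-index \
      --constraint "${tmpdir}/constraints.txt" xgboost 2>&1; then
  status=0
else
  status=$?
fi
echo "pip exit status: ${status}"
rm -rf "${tmpdir}"
```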

Overall, I think these are acceptable tradeoffs.

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #247